Molecular packing and chemical association in liquid water 
simulated using ab initio hybrid Monte Carlo and different 
exchange-correlation functionals 

Valery WebeiQ 

Physical Chemistry Institute, University of Zurich, 8057 Zurich, Switzerland. 

Safir Merchant, Purushottam D. Dixit, and D. AsthagirQ 

Department of Chemical and Biomolecular Engineering, 
Johns Hopkins University, Baltimore, Maryland 21218, USA. 
(Dated: April 20, 2010) 



1 



Abstract 

In the free energy of hydration of a solute, the chemical contribution is given by the free energy 
required to expel water molecules from the coordination sphere and the packing contribution is 
given by the free energy required to create the solute-free coordination sphere (the observation 
volume) in bulk water. With the SPC/E water model as a reference, we examine the chemical 
and packing contributions in the free energy of water simulated using different electron density 
functionals. The density is fixed at a value corresponding to that for SPC/E water at a pressure 
of 1 bar. The chemical contribution shows that water simulated at 300 K with BLYP is somewhat 
more tightly bound than water simulated at 300 K with the revPBE functional or at 350 K with 
the BLYP and BLYP-D functionals. The packing contribution for various radii of the observation 
volume is studied. In the size range where the distribution of water molecules in the observation 
volume is expected to be Gaussian, the packing contribution is expected to scale with the volume of 
the observation sphere. Water simulated at 300 K with the revPBE and at 350 K with BLYP-D or 
BLYP conforms to this expectation, but the results suggest an earlier onset of system size effects 
in the BLYP 350 K and revPBE 300 K systems than that observed for either BLYP-D 350 K 
or SPC/E. The implication of this observation for constant pressure simulations is indicated. For 
water simulated at 300 K with BLYP, in the size range where Gaussian distribution of occupation is 
expected, we instead find non-Gaussian behavior, and the packing contribution scales with surface 
area of the observation volume, suggesting the presence of heterogeneities in the system. 

Keywords: quasichemical theory, scaled particle theory, potential distribution theorem, coordination num- 
bers, molecular dynamics 
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I. INTRODUCTION 



The structure of nonassociated liquids such as liquid nitrogen or liquid argon can be 
understood in terms of packing of molecules pQ|: the structure is primarily determined 
by hard-core repulsive (excluded-volume) interactions and not by specific, directional inter- 
molecular interactions. Further, the thermodynamics of such fluids admits a mean-field (van 
der Waals type) approximation p] (see also Fig. 1 in Ref. j2j). In contrast, for an associated 
liquid like water, the attractive interactions are strong and specific, revealing themselves, 
for example, in the approximately tetrahedral ordering of water molecules around a central 
water molecule [3J. In such a case, the observed structure and thermodynamics reflects both 
packing interactions and local, chemically specific interactions between molecules. To under- 
stand the structure and thermodynamics of liquid water, it is thus imperative to understand 
the balance between packing effects and local, attractive interactions. 

Recent developments in molecular statistical thermodynamics (2], HH9] allow a detailed 
examination of the competing roles of packing and local, chemically involved interactions in 
the physics of hydration. These efforts are founded on regularizing [9J the statistical problem 
of calculating the excess chemical potential of the solute using the potential distribution 
theorem|5j [10]. By introducing a spatial scale — here the coordination radius of interest 

- the interaction of the solute with the solvent is separated into a local, chemically inter- 
esting piece and a long-range piece. The coordination radius can be adjusted such that the 
distribution of energies from the long-range interaction piece admits a simplified description 
[2j El |9]. This then helps focus the attention on the local problem solely. 

The local contributions to hydration are accounted for by the work of expelling the water 
molecules from within the coordination sphere in two limits, one in the presence of solute- 
solvent interactions and the other with those interactions turned off. The former gives the 
chemical contributions to hydration whereas the latter accounts for packing contributions. 
(We will refer to the coordination sphere without the solute as the observation volume.) 
From a simulation record, the chemical contribution can be obtained by noting the proba- 
bility, xq, of observing no water molecules in the coordination sphere. Likewise, the packing 
contribution is obtained from the probability, po, of observing an empty observation volume 

- a cavity — in bulk water. The x$ and po values are, respectively, the n = members 
of the set {x n } and {p„} of occupancy number (n) distributions in the coordination sphere 
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and the observation volume. 

For coordination radii that are chemically meaningful for the solute of interest, the distri- 
bution of coordination states {x n } below the most probable coordination state indicate the 
relative contribution of those states to the local, chemically involved contributions to hydra- 
tion [TT] . Likewise the distribution {p n } provides important insights into the hydrophobic 
aspects of hydration [T214T4"] . For observation volumes with radii in the range 2.0-3.5 A, {p n } 
is found to be nearly Gaussian [7] [T2T[T4"] . In this case, the variance of {p n }, and hence also 
the excess chemical potential of the cavity, is expected to scale linearly with the volume of 
the cavity [13] . (This scaling relation proves insightful in the analysis below.) 

Earlier, for a classical, empirical model of liquid water, based on an analysis of the {x n } 
and {p n } distributions it was found that the packing and chemical contributions balance at 
a coordination radius of about 3.3 A [2j [7J. At that point, the net hydration free energy is 
entirely determined by non-specific interactions between the water molecule and the bulk 
liquid outside the coordination sphere [2J. A similar conclusion was reached for liquid water 
simulated with the revPBE functional within constant NVE ab initio molecular dynamics 
[6] . In that study, because of limited data, using the more robustly determined mean and 
variance of the number distributions, a maximum entropy approach was used to secure xq 
and po [6j. Further, relative to water simulated with the revPBE functional, it was found 
that the enhanced structure obtained using the PBE functional correlated with attractive 
interactions outweighing packing effects. 

There were two main limitations in the earlier ab initio molecular dynamics study [5J. 
First, the simulations with the revPBE and PBE functionals were at different average tem- 
peratures (314 K and 337 K, respectively), confounding any clear comparison between the 
functionals. Second, the total simulation time was small, being less than 15 ps for any one 
functional under study. In the present work, we address both those limitations. We imple- 
ment a hybrid Monte Carlo (HMC) method [ToTl2"0"] . an approach that guarantees that we 
sample from a canonical ensemble without in any way influencing the forces obtained using 
the functional under study. The simulations are also conducted for a longer time (about 
170 ps). 

In the following section, we briefly summarize the statistical thermodynamic theory and 



recapitulate main ideas of the HMC method. In Section III we outline the methods used. 
For the ab initio simulations, we implement the HMC method as a script that interfaces with 
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the publicly available CP2K code [21]. In Section IV we present the results of our study. 
For water simulated with BLYP, the packing contribution shows a scaling behavior that is 
not expected of a homogeneous liquid material, whereas it does so for water simulated with 
revPBE at 300 K and with BLYP and BLYP-D at 350 K. With scaled particle theory as a 
reference [22], these results suggest that the density (pressure) of the liquid simulated with 
BLYP at 350 K and revPBE at 300 K is somewhat higher than normal. Overall, among 
the models studied here, water simulated with BLYP-D at 350 K appears to best describe 
liquid water at 300 K. 

II. THEORY 

A. Statistical thermodynamic theory 

Consider a solute X in a bath of water molecules and define a coordination sphere of 
radius A around the solute. In the n-coordinate state of the solute, there are exactly n 
water-oxygens in the coordination sphere. The probability of the n-coordinate state, x n , 
is the fraction of observing such cases. In the absence of the solute-solvent interactions, 
the probability of n water-oxygen atoms to be found within the observation volume is p n . 
(Recall that the observation volume refers to the coordination sphere without the solute.) 
The probabilities x n and p n are related by [TT], [23] 

Xn = , (1) 

where = k-^T and T is the temperature and fee is the Boltzmann constant. The excess 
chemical potential of the solute, /iff, is the chemical potential in excess of the ideal gas result 
at the same density and temperature of the solute X in the solution, and /iff (n) is the excess 
chemical potential with the added constraint that only n-water molecules are present in the 
coordination sphere. Observe that in Eq. [TJ p n defines the intrinsic propensity of solvent 
molecules to populate the observation volume. Solute-solvent interactions codified by the 
quantity /iff (n) — /iff modify this intrinsic or bare probability to give x n . 

Eq. [T] can be rigorously established [TT1 123] - here it suffices to note that the normal- 
ization of probabilities {x n } and {p n } lead to well-established multi-state generalizations of 
the potential distribution theorem [5j [11] [231 121] • The physical content of Eq. [I] is best seen 
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for the n = case: 

/3/4 x = lnx - lnp + (5^{n = 0). (2) 

The packing contribution, /3/zhs = — hip 0) is the free energy of forming an empty observation 
volume of a defined radius, a hard sphere (HS), in bulk water. The interaction free energy of 
the solute in the center of an empty coordination sphere is /3//|f(n = 0); the operation is one 
of placing the solute at the center of the cavity. Finally, the free energy change in allowing 
water to flood the empty coordination sphere is lnxo; thus this term measures the role of 
the specific, directional bonding between the solute and the solvent within the coordination 
sphere. (The relation x — VS-^WCj where K n is the equilibrium constant for forming 
a solute plus n-water cluster within the coordination sphere and p w is the density of water 
[H 13 El [HI |25], provides an alternative perspective on the chemical contributions to xq.) 

B. Hybrid Monte Carlo 

The hybrid Monte Carlo (HMC) method combines elements of Monte Carlo and molecular 
dynamics approaches and has been well-established in the literature [T61 - I20] . We briefly 
recapitulate the main ideas of this method here. 

We assume, as is the case here, that the iV-particle system is described by a classical 
Hamiltonian T-L(q,p) = T{p) +U(q). T and U are, respectively, the usual kinetic and 
potential energy, and q are the coordinates and p the conjugate momenta. U(q) is obtained 
using electron density functional theory, and we are interested in the canonical distribution 
of configurations of this system. 

The canonical configurations of this system can be generated by conventional (Metropolis) 
Monte Carlo: we make a one-particle move and accept or reject that move based on the 
standard Metropolis criterion. The Metropolis criterion also specifies the temperature of 
the system. To better explore configurational space it would be desirable to make multiple 
moves at a time, but such an approach is rather inefficient in conventional Monte Carlo. The 
HMC method removes this inefficiency by combining the effectiveness of molecular dynamics 
to generate iV-particle (global) moves with the effectiveness of Monte Carlo in rigorously 
generating a canonical distribution [ToTl2T)] . 

One sweep of the HMC consists of N m d molecular dynamics time steps of propagation 
through phase space starting from the configuration (q,p). The dynamics are performed 
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at constant energy using a time reversible and area preserving discretization scheme and a 
time-step of St. Initial momenta p are assigned from a Gaussian distribution at the inverse 
temperature /3. At the end of N md time-steps a candidate phase space configuration (q',p') is 
generated. If SH = H{q',p') — 7~L{(l,p) is the discretization error, then the new configuration 
is accepted with a probability 

P A = mm{l,e-P SH } . (3) 

The above procedure guarantees that detailed balance is satisfied [T6HT9] . 

There are two helpful properties of the hybrid Monte Carlo method that can serve as 
consistency checks of the simulation. First, the area preserving property implies that [T8| [T9] 

(e"^> = 1, (4) 

where (...) denote a canonical average. Second, provided the third and higher cumulants 
of Eq. [4] vanish, as happens when N is large and 8t is small such that the variance of the 
distribution of ST-L is a constant, then [T81 IT9] 

P A = erfc(^v^H) . (5) 

In the hybrid Monte Carlo procedure, the acceptance rate depends on St and N md . Here, 
in the initial phase of the simulation, we fix N m( i and block average data every 10 sweeps to 
determine the acceptance ratio. If the acceptance ratio is greater (lesser) than the targeted 
ratio, 5t is increased (decreased) by 10% to target the specified acceptance ratio. In the next 
phase, the time step 5t is held constant, ensuring that strict detailed balance is satisfied. 
Only results from the latter phase are reported here. 

III. METHODOLOGY 

A. Classical MC simulation 

The SPC/E water model [26] was taken as a reference for the structure and coordination 
number distributions of principal interest in this work. This choice was motivated by several 
considerations: (1) we seek coordination number distributions, and hence using a model is 
inevitable; (2) the oxygen-oxygen pair-correlation function, goo(r), obtained using SPC/E 
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is in reasonable agreement with the current best experimental results using the Advanced 
Light Source [27] experiment (see below); and (3) the SPC/E model is known to well-describe 
the liquid- vapor phase boundary of water E2] (see also Refs. [30] and [3~T]). 

A 32 water molecule system at a number density p = 33.33 nm~ 3 (mass density of 
0.997 g/cm 3 ) corresponding to 1 bar pressure [30] was simulated using Metropolis Monte 
Carlo [32]. Long-range electrostatic interaction was described using Ewald summation and 
Lennard- Jones interactions were truncated at half the box-length. After an initial equilibra- 
tion of 3 x 10 5 sweeps, where one sweep is one attempted move per particle, data collected 
over an additional 3 x 10 5 sweeps was used for analysis. The structure and thermodynamics 
from the small system are in excellent agreement with results using larger systems (data not 
shown) . 

B. Classical HMC simulations 

For our initial studies with the HMC method, we simulated the classical, flexible SPC/Fw 
[33] model of water. (We consider HMC simulations with the flexible model, as this is 
of most interest in ab initio simulations.) The HMC method was implemented using a 
Python script that computes 8T-L between N m d steps of dynamics and appropriately archives 
coordinates and then initiates the next sweep of HMC. Each sweep is initiated using the 
current coordinates and velocities assigned from a Gaussian distribution at the reciprocal 
temperature (3. In assigning velocities, care is taken to ensure that the system does not 
have a net translational momentum. The coordinates and velocities are then handed to 
the molecular dynamics program. Here the molecular dynamics part of the simulation was 
performed using NAMD |34j . 

We first considered a system with 64 water molecules at a particle density of p = 
33.33 nm -3 . The initial oxygen coordinates were obtained from the coordinates of a sys- 
tem of hard-spheres at a reduced density (per 3 ) of 0.3. We purposely chose a poor initial 
configuration to estimate how well the goo( r ) converges to that obtained using a Langevin 
dynamics simulation of a well-equilibrated system. N md = 10, 20, 50, 75, and 100 were con- 
sidered. To compare with the ab initio simulations, we also considered a 32 particle system 
and N m d = 50. The initial configuration of the 32 particle system was obtained from an 
equilibrated configuration of SPC/E water molecules. 
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C. Ab initio HMC simulations 

The molecular dynamics part of the simulations were performed using the publicly avail- 
able cp2k code [21J. The HMC method was implemented as a script as discussed above. 

The cp2k code uses the Gaussian plane wave (GPW) method [211 E5] based on the 
Kohn-Sham formulation of density functional theory together with a hybrid Gaussian and 
plane wave basis. The norm-conserving pseudopotentials of Goedecker-Teter-Hutter [361 EZ] 
(GTH) and a triple-^ valance basis set augmented with two sets of cZ-type or p-type polariza- 
tion function (TZV2P) were used throughout; our choice follows several recent studies using 
the same basis [3"81-43j . A 280 Ry cutoff for the auxiliary plane wave grid was employed, 
and the efficient and numerically stable orbital transformation energy minimizer introduced 
in Ref. [44] is used to converge the SCF iterations to 10 -6 a.u. of the Born-Oppenheimer 
surface. The nuclei are propagated by the velocity Verlet [32] algorithm. Standard masses 
were used for hydrogen and oxygen. The simulation system comprises 32 water molecules at 
a number density of 33.33 nm~ 3 . The initial configuration was obtained from an equilibrated 
configuration of SPC/E water molecules. 

The electronic structure was solved using the BLYP [151110], revised PBE (revPBE) [IT] , 
and the BLYP-D [48] generalized gradient approximations to density functional theory. 
The BLYP-D functional includes an empirical correction (denoted by '-D') for dispersion 
interactions. Following a recent study [13], a cutoff of 48A was used for the empirical 
dispersion contribution. 

For all the functionals we report data using N m d = 50. (This choice is explained below. 
Test calculations with revPBE and N m d = 10, 20, 75 show, as expected [19], the insensitivity 
of the results to choice of N m d.) 

D. Temperature effects 

In the present study we explore the classical statistical mechanics of liquid water, and an 
important element missing from these studies is the effect of proton nuclear quantum effects 
on weakening intermolecular bonding. Earlier studies, for example Refs. [19TI5*2"] . of water 
using empirical interaction potentials find that increasing the temperature by about 50 K 
mimics the effect of including proton nuclear quantum effects. A more recent study finds 
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less dramatic quantum effects if molecular flexibility is considered and the parameters of the 
empirical model optimized with quantum effects [53J. 

In this work, we regard temperature as a convenient parameter to change the effective 
strength — 014 is the pertinent quantity in sampling configurations — of interaction ob- 
tained using a functional. In this study, we perform simulations at 300 K and 350 K, with 
the system simulated at the latter temperature mimicking the effect of weaker intermolec- 
ular interactions. Below, results using a particular functional and a given temperature are 
labelled by 'functional temperature' combination. 

E. Statistical uncertainties 

Throughout this work, statistical uncertainties were estimated following the block trans- 
formation procedure developed in Allen and Tildesley |32j, following the earlier work of 
Friedberg and Cameron [M] • In the case of the pair-correlation, each bin was treated as a 
separate channel for data and the error analysis was performed on the counts obtained in 
each channel. For obtaining uncertainties in x n and p n , the appropriate number of instances 
per frame was used as the data stream and errors estimated. A similar approach was used 
for estimating uncertainties in (SH) and {e~^ SH ). 

IV. RESULTS AND DISCUSSION 

A. HMC with SPC/Fw model and choice of N md 

Figure [I] compares the structure of the SPC/Fw water obtained using the HMC and 
Langevin dynamics approaches. Results based on other choices of N m d overlap those noted 
in Figure [T] and are not shown for clarity. 

A reasonable choice of N m d can be inferred from the velocity autocorrelation time [55] 
r vac = (3rnV, where m is the mass of the particle and T> the diffusion coefficient. r vac is the 
time by which velocities become uncorrelated and diffusive motion takes hold. Thus beyond 
r vac , configurations are explored by a less efficient diffusive motion, a situation that we seek 
to avoid in the HMC. For liquid water, V « 0.25 A 2 /ps at 300 K (56j [57] and r vac « 200 fs. 

In the HMC procedure, by design, the velocities become uncorrelated every N m d time 
steps, and for a fixed integration time-step (St), the auto-correlation time r vac is directly 
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proportional to N m d- Thus St ■ N m d very small compared to r vac is akin to exploring config- 
urational space diffusively and is not to be preferred. On the other hand, a large St ■ N m d 
can lead to larger integration errors (STL) and lower acceptance rates. For these reasons, we 
conservatively choose N m d = 50 (~ r vac /4 for an integration time step of 1 fs) for all our 
simulations. 

B. HMC with ab initio models 

In Table [i] we collect several key metrics. Clearly, (e~ l3S ' H ) ~ 1, and the value of the 
acceptance rate predicted by Eq. [3] is also close to the value actually found. Since St ~ 1 fs, 
simulations with each density functional extended beyond about 170 ps. In the first 2000 
sweeps (about 100 ps), St was varied to target an acceptance rate around 70%. In the 
subsequent nearly 1400 sweeps (about 70 ps), St was held fixed. Of these, 500 sweeps 
(about 25 ps) were set aside for further equilibration and the remaining used in analysis. 

In Figure [2] the radial density distribution of water oxygen is shown for the different 
functionals and temperatures considered here, and in Figure [3] we compare the results of the 
current BLYP simulations with some of the earlier results based on the same (CP2K) code 
and basis set (GTH-TZV2P). Comparing BLYP 300 K and BLYP 350 K, we find that at 
the higher temperature the first maximum is lowered by about 0.2 (Table [T]) and the first 
minimum is likewise elevated. The change with temperature is in the right direction. 

In comparing the present results to other simulation results, some aspects need to be em- 
phasized. First, the response to temperature will sensitively depend on the water model (for 
example, see [19]). Second, the response will sensitively depend on the simulation ensem- 
ble, especially when small system sizes are involved. With these caveats, observe that the 
location and magnitude of the first maximum for BLYP 300 K falls between NVE ensemble 
results [10] reported at an average temperatures of 292 K and 318 K. (The uncertainty in 
temperature was reported to be about ±10 K around the average temperature [40J.) How- 
ever, the agreement is less encouraging at 350 K (Figure [3j right panel). The dependence of 
the thermal and the mechanical response of the liquid on the thermodynamic state point may 
underlie the observed difference. For example, the thermal expansion coefficient of water 
increases with temperature [58] and as does the compressibility beyond about 320 K. (This 
likely also explains why in this study for revPBE 300 K a more structured pair-correlation 
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function is obtained in comparison to the earlier NVE ensemble study [6] of a 32 water 
molecule system at a temperature of 314 ± 20 K.) 

Comparing the present results to those from an earlier Monte Carlo study, M05 in Figure [3] 
[39] (and also Ref.[3E]), suggests that the peak of the correlation function obtained in that 
study is somewhat lower. (The location of the peaks are nearly the same.). Several factors 
may underlie the observed differences. First, the temperatures are different (Figure [3]). 
Second, a larger system was used in the earlier study [38| [39] . The impact of system size 
will sensitively depend on the water model used. (We will return to this aspect below 
in discussing Figure [6]) For example, with the SPC/E water model, the pair-correlation 
approaches the bulk almost after the first hydration shell (Figure [2]) and both 32 and 64 
particle simulation cells give essentially identical pair-correlation functions. But as Figure [2] 
suggests, the correlations appear more pronounced for water simulated with BLYP and these 
correlations can be expected to persist for longer length scales leading to more pronounced 
system size effects. Third, the different methodologies may be a factor. In the earlier Monte 
Carlo study [Ml ES], an empirical potential was used as an importance function to sample 
configurations [59] . (That empirical potential was parameterized against a Car-Parrinello 
simulation [60J.) But it is not clear if the empirical model was a good reference model[61J 
for the target system studied. 

C. Number distributions 

Figure [4] (left panel) shows the distribution of coordination numbers observed around 
a distinguished water molecule. Fig. [4] (right panel) depicts the variation of the chemical 
contribution to the excess chemical potential for various coordination radii. 

For a chemically reasonable coordination radius, the coordination states below the most 
probable coordination state reveal the importance of local interactions [H] . As Figure [4] 
(left panel) suggests, relative to SPC/E, the probability of the n < 4 states drops sharply 
for BLYP 300 K. This suggests that BLYP 300 K leads to a somewhat tighter binding of 
the distinguished water molecule to the neighboring water molecules. If the local, cohesive 
interactions were weaker, we expect to observe a greater proportion of the n < 4 states. 
Comparing BLYP 300 K and BLYP 350 K shows that weakening the local, cohesive interac- 
tions does indeed elevate the proportion of the n < 4 states. Comparing In xq for chemically 
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meaningful coordination radii shows that the work of expelling water molecules from within 
the coordination shell is more for BLYP 300 K than for any other case. Further, across 
all coordination states and coordination radii, the BLYP-D 350 K best follows the SPC/E 
results. 

Figure [5] (left panel) gives the occupation statistics in an observation volume of radius 3 A. 
Comparing BLYP 300 K and BLYP 350 K shows that increasing the temperature makes it 
harder to open a cavity in liquid water. This is as expected, since the disorder in the medium 
increases with increasing temperature. (As an aside, this observation also explains why the 
solubility of hydrophobic solutes decreases with increasing temperature [14].) Including 
additional attractive interactions mitigates the effect of increasing temperature, a trend 
more clearly seen in the behavior of lnpo (Figure [5j right panel). 

D. Scaling of the packing contribution 

To facilitate the discussion below, we first collect several observations about occupancy 
number distributions, the predicted scaling of the packing contribution based on theory, and 
number distributions and system size effects. 

In liquid water, for observation volumes of radii between about 1.5 A to 3.5 A, the 
occupation number distribution is found to closely approximate a Gaussian IT^HTo] . (The 
approximation is much better for the smaller radii.) Physically this means that at the size- 
scale of the observation volume, the presence of one molecule anywhere in the observation 
volume has only a modest (or little) effect in inducing the presence of another molecule in the 
volume. But as we increase the size of the observation volume, more of the network structure 
of water JU2H01] comes into focus, and the occupation number is no longer Gaussian. In 
this instance, the presence of one water does induce the presence of another molecule in the 
volume. 

Theory [131 US] suggests that when the occupation number is approximately Gaussian, 
the excess chemical potential of the empty observation volume (the cavity) scales with the 
volume of the cavity. On the other hand, the excess chemical potential of a large cavity 
depends on the surface area of the cavity. This scaling arises because at the large length 
scale (> 10 A), the free energy is effectively the work it takes to create the interface. For 
the larger cavities, the preference for preserving hydrogen-bonding outweighs the need to 
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accommodate the cavity [15] in the liquid matrix. 

The occupation number distributions will be sensitive to system size effects. The number 
variation reflect positional fluctuations (of the molecules) and these will always be con- 
strained in a constant volume simulation. In practice we find (Merchant and Asthagiri, in 
preparation) that system size effects manifest for observation spheres with a volume greater 
than about 3% of the box volume. (For system sizes considered here, this amounts to an 
observation sphere of radius ~ 2 A.) For cavities larger than this size, the excess chemical 
potential will be more positive than what would be obtained at constant pressure and/or 
large system sizes. The revised scaled particle theory [22J provides the packing contributions 
for cavities of various sizes in the large system limit. 

From Figure [6j we find that the packing contribution (/3/iffs = — m po) ^ or BLYP 350 K 
and revPBE 300 K is above the scaled particle result for A > 2.4 A. The dependence 
of — /3A~ 2 /i|f s on A is still linear, but the rate of increase is greater than that predicted 
theoretically. This indicates the onset of system size effects, and suggests that the pressure 
in the fluid is higher than normal. Thus in a constant temperature and pressure simulation 
of these systems, we would expect a lower density. A recent report [13] using the BLYP 
functional at a temperature of 330 K and a pressure of 1 bar does indeed find a lower 
density for the fluid. The present results are consistent with that observation. (Although 
the temperature in that study was lower than the one here, we suspect that the above noted 
trend will hold.) Compared to both BLYP 350 K and revPBE 300 K, system size effects set 
in somewhat later in BLYP-D 350 K, just as they do for SPC/E. 

The trend for BLYP is striking. For A between 2.5 to 3.0 A, the packing contribution 
scales with the surface area of the cavity, a scaling behavior that is not predicted to arise 
until after A ~ 10 A. (At that size scale, this behavior reflects the need for water molecules 
to maximize their bonding by de- wetting the interface [15].) This unexpected behavior 
clearly reflects non-Gaussian occupation statistics (cf. Figure [5j left panel), and suggests the 
presence of heterogeneities, perhaps strongly bonded pairs or other such clusters of water 
molecules, in the liquid. (The more negative chemical contribution in BLYP 300 K water 
supports this suggestion.) On the basis of temperature effects of water simulated with BLYP, 
an earlier report |40j found deviations of the diffusion coefficient from an Arrhenius behavior 
at temperatures around 300 K. Heterogeneities in the liquid can lead to such behavior and 
our results appear to corroborate the earlier study. A thorough analysis of the nature of 
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heterogeneities/network structure of the liquid [62H64"] will require a much larger system, 
preferably one where there are at least 3-4 hydration layers around a central water. Finally, 
for a somewhat larger cavity, A > 3.1 A, — /3X~ 2 fj^ s once again begins to increase with A, as 
must be expected, since the volume of the system is constant. 

E. Balance of chemistry and packing 

Figure [7] (left panel) shows the sum of the inner-shell chemistry and packing contributions 
obtained using the simulation data (Figs. [4] and [5]). Figure [7] (right panel) are information 
theory[6] estimates. Consistent with the above discussion, we find that local chemistry 
outweighs packing for BLYP 300 K. As an extreme approximation, if we assume that outer 
contribution is same for all the cases, then on the basis of the simulations we expect that 
the excess chemical potential of water simulated with BLYP is about 4 k^Ts more negative 
than the SPC/E value (= —7.2 kcal/mol, Merchant and Asthagiri, unpublished). For all the 
other cases, relative to the SPC/E value, the excess chemical potential will be more positive 
by about a k^T. 

Information theory predictions are only qualitatively consistent with the data; for ex- 
ample, BLYP is predicted to be more strongly bound than BLYP-D 350 K. However, the 
quantitative deviations from actual data can be as high as 5 k-^T (cf. BLYP 350 K, right 
and left panels of Fig. [7]). But this is not surprising given that {x n } (Fig. |4j left panel) 
is non-Gaussian (see also[7], and the two-moment information theory model will only be 
approximate. 

V. CONCLUSIONS 

The free energy of expelling water molecules from the coordination sphere — the chemical 
contribution to hydration — of a distinguished water molecule was calculated for liquid 
water simulated at 300 K with revPBE and BLYP functionals and at 350 K with BLYP and 
BLYP-D functionals. From this calculation we find that the distinguished water molecule is 
somewhat more tightly bound in water simulated with BLYP at 300 K than for the other 
cases. 

The hard-sphere packing contribution per unit surface area, f3\~ 2 fi^ s , of the hard-sphere 
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was obtained for various radii (A) of the hard-sphere and the results compared with those 
based on the revised scaled particle theory. Except for BLYP 300 K, /3A~ 2 /iHS sc& l es linearly 
with A, but shows distinct domains of linearity, a behavior consistent with the Gaussian 
occupancy statistics in an observation volume of the same size as the hard sphere. 

For revPBE 300 K and BLYP 350 K and A > 2.4 A, /3X~ 2 fif s increases faster than the 
scaled particle predictions and indicates the onset of system size effects earlier than it does 
for SPC/E. This shows that at density of 0.997 gm/cc (corresponding to 1 bar pressure in 
the SPC/E model), the pressure in these systems is higher than 1 bar. 

For BLYP 300 K, /3A~ 2 /iHS is independent of A in the 2.4-3.0 A size-range. This behavior 
is not expected of liquid water at this size range and suggests the presence of heterogeneities 
in the medium. 
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TABLE I: Statistics from the HMC simulations of water. The temperature T = 1/(A;b/3). N m d = 50 
is the number of molecular dynamics steps between consecutive Monte Carlo steps, r is the total 
number of sweeps. In the first 2000 sweeps (~ 100 ps) of ab initio simulations, St was periodically 
updated to target an acceptance rate of 0.7. Subsequently, St was held fixed at the value shown, 
and the first 500 sweeps of the total time shown were set aside for equilibration. For the SPC/Fw 
simulation, St = 1 fs was used always and the first 2000 sweeps were set aside for equilibration. 
St is the time step for integrating the equations of motion. {135%) is the average discretization 
error between consecutive Monte Carlo steps. The expected value of (e - ^* 5 ^) is 1. R is the ratio of 
the observed acceptance rate to that predicted by Eq. |3j Statistical uncertainties at the la level 
are noted, except for the goo( r ) where it is given at the 2a level, r is the distance to the first 
maximum of the goo( r )- Note that r is defined to only within half the bin-width of 0.04 A. The 
simulation system comprises 32 water molecules. 





T 


T(K) 


((35-H) (e~P sn ) 


St (fs) 


R 


r (A) 


goo(r) 


SPC/Fw 


4000 


300 


0.20 ±0.01 1.00 ±0.01 


1.00 


1.00 


2.74 


3.21 ±0.06 


BLYP 


1463 


300 


0.35 ±0.06 1.03 ±0.03 


1.16 


1.04 


2.74 


3.42 ±0.09 


BLYP 


1459 


350 


0.31 ±0.03 1.08 ±0.04 


1.12 


0.99 


2.74 


3.24 ±0.13 


BLYP-D 


1400 


350 


0.26 ±0.02 1.01 ±0.03 


1.06 


0.97 


2.74 


3.12 ±0.08 


revPBE 


1391 


300 


0.26 ±0.03 1.03 ±0.05 


1.02 


1.01 


2.74 


3.37 ±0.08 
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FIG. 1: Structure of SPC/Fw water at 300 K using the HMC and Langevin dynamics approaches. 
The results are insensitive to the choice of N m ^ and only results with N m d = 50 are shown. For 
the N m d = 50 simulation, 4 x 10 3 sweeps were performed and the last 2 x 10 3 sweeps were used 
for averaging. A fixed time step of 8t = 1 fs was used. For the HMC simulation with N = 64, the 
starting oxygen positions were based on those of a hard-sphere fluid at a reduced density of 0.3. 
Results using N = 32 particles overlap those shown and are not included to preserve clarity. A 
bin-width of 0.04 A has been used for analyzing the data. 
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FIG. 2: Oxygen-oxygen radial distribution function at 300 K (unless otherwise noted). goo( r ) 
for SPC/E (dotted line) is used as the standard for comparing different functionals. Data using 
SPC/Fw is similar to that from SPC/E and is not shown. A bin- width of 0.04 A has been used 
for analyzing the data. 
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FIG. 3: Oxygen-oxygen radial distribution function using the BLYP density functional. Right 
panel: BLYP at 300 K (blue curve). Left panel: BLYP at 350 K (light blue curve). The statistical 
uncertainties at the 2a level are shown. All calculations are based on the cp2k code, the GTH- 
TZV2P basis set, and cutoff (280 Ry) for the charge density grid (except where noted). M05: 
Monte Carlo, with sampling of configurations using an empirical potential; a 1200 Ry cutoff for the 
charge density grid was used and the simulation comprises 64 water molecules |39j . V05: NVE 
Born-Oppenheimer molecular dynamics [30] on a 32 water molecule system. Uncertainty of about 
±10 K was reported around the average temperature noted in the figure [40] . A bin- width of 0.04 A 
has been used for analyzing the data. 
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FIG. 4: Left panel: Distribution of coordination numbers around a distinguished water molecule. 
The coordination radius is 3.0 A. The curves are translated vertically for clarity. The SPC/E {x n } 
distribution is overlain on the results from each of the density functional to facilitate compari- 
son. Right panel: Variation of the chemical contribution with coordination radius A. Statistical 
uncertainty at the la level is shown. 
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FIG. 5: Left panel: Distribution of water molecules in a cavity of radius 3 A. The curves are 
translated vertically for clarity. The SPC/E {p n } distribution is overlain on the results from each 
of the density functional to facilitate comparison. Right panel: Variation of the packing contribution 
with coordination radius A. Statistical uncertainty at the la level is shown. 
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FIG. 6: The packing contribution scaled by the surface area (to within constants) versus the radius 
of the hard sphere. Compare also with Figure 2 in Ref.[T3]. The solid curve is based on the revised 
scaled particle theory of Ashbaugh and Pratt |22j . A linear fit to the scaled particle results for 
1.5 < A < 3.5 is shown (dotted line). 
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FIG. 7: Left panel: The sum of inner-shell chemistry and packing (steric) contributions to the 
excess free energy of hydration (in units of k&T). Right panel: The sum of inner-shell chemistry 
and packing inferred from a two moment information theory model, that is assuming Gaussian 
distribution of {x n } and {p n } [5} H2HH] . 
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